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Abstract — This  paper  introduces  a  new  stochastic  global  opti¬ 
mization  method  targeting  protein-protein  docking  problems,  an 
important  class  of  problems  in  computational  structural  biology. 
The  method  is  based  on  finding  general  convex  quadratic  underes¬ 
timators  to  the  binding  energy  function  that  is  funnel-like.  Finding 
the  optimum  underestimator  requires  solving  a  semidefinite  pro¬ 
gramming  problem,  hence  the  name  semidefinite  programming- 
based  underestimation  (SDU).  The  underestimator  is  used  to  bias 
sampling  in  the  search  region.  It  is  established  that  under  appro¬ 
priate  conditions  SDU  locates  the  global  energy  minimum  with 
probability  approaching  one  as  the  sample  size  grows.  A  detailed 
comparison  of  SDU  with  a  related  method  of  convex  global  un¬ 
derestimator  (CGU),  and  computational  results  for  protein-protein 
docking  problems  are  provided. 

Index  Terms — Linear  matrix  inequalities  (LMIs),  optimization, 
protein-protein  docking,  semidefinite  programming,  structural  bi¬ 
ology. 

I.  Introduction 

THIS  work  is  motivated  by  a  fundamental  and  challenging 
problem  in  computational  structural  biology.  Proteins  in¬ 
teract  with  each  other  and  with  other  chemical  entities  in  the  cell 
to  form  complexes  consisting  of  two  (or  more)  macromolecules. 
Protein-protein  interactions  play  a  central  role  in  metabolic  con¬ 
trol,  signal  transduction,  and  gene  regulation.  Proteins  also  in¬ 
teract  with  a  large  number  of  small  molecules  that  serve  as  sub¬ 
strates,  inhibitors,  or  cofactors  in  metabolic  reactions,  as  well  as 
with  external  compounds  acting  as  drugs  in  modulating  biolog¬ 
ical  behavior.  Determining  the  three-dimensional  (3-D)  structure 
of  a  complex  from  the  atomic  coordinates  of  two  or  more  inter¬ 
acting  molecules  is  known  as  the  molecular  docking  problem. 
Experimental  techniques  -x-ray  crystallography  or  nuclear  mag- 
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netic  resonance  (NMR)-  can  provide  such  3-D  structures  but 
are  time-consuming,  expensive,  and  not  universally  applicable  as 
many  molecular  complexes  are  short-lived  and  exist  only  under 
well  defined  cellular  conditions.  As  a  result,  solving  these  prob¬ 
lems  computationally  is  critical  and  has  attracted  a  lot  of  attention. 

Docking  two  proteins  can  be  seen  as  a  control  problem  as  one 
has  to  control  one  of  them  as  it  approaches  and  docks  with  the 
other;  this  has  parallels  with  problems  arising  in  robot  arm  ma¬ 
nipulation  [1],  [2].  From  a  static  point  of  view,  thermodynamic 
principles  imply  that  proteins  bind  in  a  way  that  minimizes  the 
Gibbs  free  energy  of  the  complex.  These  energy  functions  have 
multiple  deep  funnels  and  a  huge  number  of  local  minima  of  less 
depth  that  are  spread  over  the  domain  of  the  function  (see  Fig.  1). 
This  “complexity”  of  the  energy  landscape  is  perhaps  best  ar¬ 
ticulated  in  the  Levinthal  paradox  [3]:  The  probability  that  na¬ 
ture  could  find  the  native  3-D  structures  (or  “conformations”) 
by  random  search  seems  uncomfortably  close  to  zero.  More  re¬ 
cently  though,  it  was  recognized  that  the  funnel-like  shape  of 
the  energy  function  may  explain  why  proteins  bind  with  other 
molecules  very  fast  and  in  so  doing  they  may  follow  multiple 
pathways  on  the  energy  surface  (see  [4]).  In  this  paper,  we  intro¬ 
duce  a  new  optimization  method  for  solving  docking  problems. 
It  is  exactly  this  funnel-like  shape  of  the  energy  function  we  set 
out  to  exploit. 

In  general,  the  energy  functions  include  forces  that  act  in  dif¬ 
ferent  space- scales,  resulting  in  multi-frequency  behavior,  and 
leading  to  a  huge  number  of  local  minima  caused  by  high-fre¬ 
quency  terms.  Even  though  some  global  optimization  methods 
(see  [5])  have  been  useful  in  the  related  problem  of  protein 
folding,  it  appears  that  only  simulated  annealing  has  been  con¬ 
sistently  used  in  docking.  Specifically,  Baker  et  al.  [6]  employ 
simulated  annealing  on  a  progressively  improving  approxima¬ 
tion  of  the  energy  function.  This  has  been  fairly  effective  but  it 
is  computationally  expensive  and,  thus,  applicable  to  relatively 
small  problem  instances.  A  number  of  other  recent  approaches 
have  attempted,  with  some  success,  to  use  the  funnel-like  shape 
to  guide  the  global  search  to  the  vicinity  of  the  global  minimum 
[e.g.,  the  semiglobal  simplex  (SGS)  by  Dennis  and  Vajda  [7], 
and  SmoothDock  by  Camacho  and  Vajda  [8]].  Of  most,  rele¬ 
vance  to  this  paper  is  the  convex  global  underestimator  (CGU) 
method  of  Phillips  et  al.  [9]  where  convex  canonical  quadratic 
underestimators  are  used  to  approximate  the  envelope  spanned 
by  the  local  minima  of  the  energy  function.  It  has  been  shown 
that  CGU  does  not  perform  well  in  some  cases  and  that  its  per¬ 
formance  deteriorates  as  the  dimension  of  the  search  space  in¬ 
creases  [7] .  We  contend  that  a  critical  reason  for  this  poor  per- 
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Fig.  1.  (Left)  Protein  docking  funnels  and  the  operation  of  the  docking  method  SmoothDock  [8].  (Right)  A  funnel-like  shaped  function  and  a  quadratic  function 
underestimating  the  surface  spanned  by  the  local  minima. 


formance  is  the  restricted  class  of  underestimators  used  in  CGU. 
This  restriction  amounts  to  a  lack  of  flexibility  in  capturing  the 
overall  shape  of  the  funnels  and  hence  an  inability  to  locate 
promising  regions. 

We  use  the  same  strategy  of  using  quadratic  convex  functions 
to  underestimate  the  envelope  spanned  by  the  local  minima  of 
the  energy  function.  However,  we  consider  the  class  of  general 
convex  quadratic  functions  for  underestimation.  In  this  case, 
given  a  finite  set  of  local  minima,  finding  the  optimal  underes¬ 
timator  amounts  to  solving  a  semidefinite  programming  (SDP) 
problem,  hence  the  term  SDU.  The  dual  to  this  problem  in¬ 
volves  linear  matrix  inequalities  (LMIs)  which  have  found  ubiq¬ 
uitous  use  in  control  theory  lately.  Another  key  contribution  of 
the  new  method  we  introduce  is  the  use  of  a  biased  exploration 
strategy  that  is  guided  by  the  underestimator.  We  show,  theoret¬ 
ically  as  well  as  experimentally,  that  SDU  outperforms  CGU, 
often  significantly.  Moreover,  we  show  that  SDU  will  indeed 
(probabilistically)  converge  to  the  global  minimum  when  ap¬ 
plied  to  a  general  class  of  funnel-shaped  functions  as  the  number 
of  local  minima  used  for  underestimating  the  energy  function 
grows.  This  suggests  that  SDU  has  the  potential  to  be  useful 
in  solving  molecular  docking  problems.  A  number  of  numer¬ 
ical  results,  some  involving  real  instances  of  the  protein-protein 
docking  problem,  reinforce  this  conclusion. 

The  rest  of  this  paper  is  organized  as  follows .  Section  II  presents 
background  material  on  docking  and  surveys  various  approaches. 
Section  III  presents  the  SDU  method.  Comparisons  with  CGU 
are  in  Section  IV.  SDU’s  convergence  properties  are  discussed 
in  Section  V.  Numerical  results  for  a  set  of  test  functions  are  in 
Section  VI.  Some  results  on  docking  proteins  are  discussed  in 
Section  VII.  We  end  with  conclusions  in  Section  VIII. 

Notational  Conventions 

Throughout  this  paper,  all  vectors  are  assumed  to  be  column 
vectors.  We  use  lower  case  boldface  letters  to  denote  vectors  and 
for  economy  of  space  we  write  x  =  (x\ , . . . ,  xn)  for  the  column 
vector  x.  x'  denotes  the  transpose  of  x,  0  the  matrix  (or  vector) 
of  all  zeroes,  e  the  vector  of  all  ones,  I  the  identity  matrix,  and 
ei  the  i\h  unit  vector.  For  any  vector  x  we  write  ||x||  i  for  the  LI 
norm,  i.e.,  ||x||i  =  \xiV  and  llxll  for  the  Euclidean  norm. 

We  use  upper  case  boldface  letters  to  denote  matrices  and  write 
A  =  for  the  matrix  A  with  (i,  j)th  element  equal  to 


Aij.  We  denote  by  diag(x)  the  diagonal  matrix  with  elements 
Xx, . . .  y  xn  in  the  main  diagonal  and  zeroes  elsewhere.  We  also 
denote  by  diag(A,  B)  the  block  diagonal  matrix  with  matrices 
A  and  B  in  the  main  diagonal  and  zeroes  elsewhere.  We  define 

n  n 

F,Y=E£F«yvi  (d 

i=l  j= 1 

which  is  the  Frobenius  inner  product  of  matrices.  Finally,  for 
any  event  S,  1{<S}  denotes  its  indicator  function. 

II.  Preliminaries  and  Background 

In  this  section,  we  provide  some  biological  background  on  the 
docking  problem  in  order  to  establish  the  appropriate  context  for 
the  optimization  methodology  we  introduce. 

State-of-the-art  protein-protein  docking  approaches  start  by 
identifying  conformations  with  good  surface/chemical  com¬ 
plementarity  where  the  two  interacting  proteins  are  considered 
rigid  bodies.  Accordingly,  the  problem  of  forming  a  com¬ 
plex  by  docking  one  protein — the  ligand — to  the  other — the 
receptor — has  parallels  with  problems  arising  in  rigid  body 
motion  and  robot  arm  manipulation  (see,  e.g.,  Murray  et  al.  [1], 
Gwak  et  al.  [2]).  In  particular,  one  has  to  control  the  ligand  as 
it  approaches  and  docks  with  the  receptor.  The  ligand’s  relative 
position  lies  in  the  Euclidean  group  of  rigid-body  motion, 
SE{ 3),  which  is  the  semidirect  product  of  U3  (translations) 
and  the  rotation  group  SO( 3).  The  dynamics  of  the  ligand’s 
approach  to  the  receptor  play  an  important  role  in  nature  but  are 
hard  to  model  (and  computationally  prohibitive  to  simulate)  in 
the  time-scale  they  evolve.  The  problem  is  further  complicated 
by  the  fact  that  proteins  are  not  rigid  bodies  and  experience 
conformational  changes  as  they  bind  (mostly  the  side-chains 
on  the  interface).  The  protein-protein  docking  literature  (e.g., 
[8]  and  the  references  therein)  is  predominantly  treating  the 
docking  problem  as  a  static  problem  and  this  is  the  view  we 
will  initially  adopt  for  this  problem. 

Protein-protein  docking  approaches  initially  screen  confor¬ 
mations  by  various  measures  of  surface  complementarity  which 
can  be  efficiently  computed  using  fast  Fourier  correlation  tech¬ 
niques  (FFTs).  However,  when  docking  unbound  conformations 
these  methods  yield  many  “false  positives”  (i.e.,  conformations 
that  have  good  score  but  are  far  from  the  native  one).  An  effec¬ 
tive  approach  used  by  Comeau  et  al.  [10]  is  to  cluster  the  confor- 
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mations  obtained  by  the  FFT  filtering  based  on  how  close  they 
are.  (A  common  metric  of  distance  between  conformations  is 
the  root  mean  square  deviation  (RMSD)  of  their  atomic  coor¬ 
dinates).  Clustering  identifies  low  free  energy  regions  (funnels) 
one  needs  to  further  explore.  We  will  be  referring  to  the  problem 
of  exploring  a  funnel  as  the  medium-range  problem.  The  pro¬ 
posed  SDU  method  aims  at  solving  this  problem  by  exploring 
the  funnel  shape  of  the  energy  function.  Next  we  review  key 
properties  of  the  free  energy  functions. 

A.  Free  Energy  and  its  Mathematical  Properties 

Free  Energy  Models: 

At  fixed  temperature  and  pressure,  a  complex  of  two 
molecules  adopts  the  conformation  that  corresponds  to  the 
lowest  Gibbs  free  energy  of  the  system  that  includes  the  com¬ 
ponent  molecules  and  the  solvent — usually  water — surrounding 
them.  Thus,  in  docking  calculations  the  natural  target  function 
to  minimize  is  an  approximation  of  the  Gibbs  free  energy, 
GRL ,  of  the  receptor-ligand  complex,  or  that  of  the  binding 
free  energy,  AG.  In  particular,  AG  =  GRL  —  GR  —  GL ,  where 
Gr  and  Gl  are  the  free  energies  of  the  receptor  and  ligand, 
respectively,  and  both  GR  and  GL  are  independent  of  the 
conformation  of  the  complex;  thus,  minimizing  GRL  reduces 
to  minimizing  AG. 

We  use  free  energy  evaluation  models  that  combine  molec¬ 
ular  mechanics  with  continuum  electrostatics  and  empirical  sol¬ 
vation  terms  [11].  For  protein  docking  applications,  the  binding 
free  energy  can  be  approximated  as 

AG  =  A  Eejec  +  A  Gdes  —  TASsc  +  AEvdw  (2) 

where  AE'eiec  is  the  change  in  electrostatic  energy  upon 
binding,  AEvdw  is  the  change  in  van  der  Waals  energy,  and 
AG *des — the  desolvation  free  energy — is  associated  with  the 
removal  of  the  solvent  from  the  interacting  surfaces  of  the  pro¬ 
tein  and  ligand,  where  the  *  denotes  that  it  includes  the  change 
in  the  solute-solvent  van  der  Waals  energy.  The  entropy  term, 
— TASsc ,  accounts  for  the  decrease  in  entropy  experienced  by 
the  interface  side  chains  upon  binding. 

Multifrequency  Behavior: 

The  free  energy  function  can  be  regarded  as  the  sum  of  three 
components  with  different  frequencies.  First,  the  following 
“smooth”  component 

A Gs  =  AEq\qC  +  A Gdes  ~  TASsc  (3) 

changes  relatively  slowly  along  any  reaction  path,  where  the  de¬ 
solvation  free  energy  AGdes  does  not  include  the  solute- sol¬ 
vent  van  der  Waals  term.  The  full  contribution  of  the  desolva¬ 
tion  forces  is  estimated  by  the  atomic  contact  potential  (ACP) 
[12],  where  the  ACP  potential  includes  the  self-energy  change 
on  desolvating  charge  or  polar  atom  groups  and  the  side  chain 
entropy  loss,  i.e.,  A GAcp  =  A Gdes  -  TASsc.  A GAcp  is 
much  less  sensitive  to  structural  perturbations  than  AEvdw  and 
AEe\ec.  The  electrostatic  energy  AEe\ec  changes  with  an  in¬ 
termediate  frequency,  and  the  frequency  of  change  is  very  high 
for  AGvdw-  Thus,  the  van  der  Waals  component  is  essentially 
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a  high  frequency  noise,  because  on  its  own  it  carries  very  little 
information  about  the  distance  from  the  native  state. 

Funnel-Like  Shape: 

As  discussed  in  Section  I,  the  concept  of  the  funnel-shaped 
energy  landscapes  (see  Fig.  1)  has  had  a  significant  impact  on 
understanding  protein  folding  and  docking.  This  behavior  has  a 
physical  explanation.  Due  to  the  term  A£Vdw,  the  free  energy 
function  defined  by  (2)  exhibits  high  sensitivity  to  structural  per¬ 
turbations.  In  addition,  in  any  conformation  without  steric  over¬ 
laps  the  van  der  Waals  energy  term  is  favorable,  resulting  in  a 
number  of  local  minima.  Since  most  random  perturbations  yield 
overlaps,  the  local  minima  are  separated  by  high  energy  barriers. 
In  the  conformations  without  overlaps,  the  solute-solute  and 
solute-solvent  interfaces  are  equally  well  packed,  and  thus  the 
intermolecular  van  der  Waals  interactions  in  the  bound  state  are 
largely  balanced  by  solute-solvent  interactions  in  the  free  state. 
Therefore,  restricting  consideration  to  such  conformations,  both 
AT^dw  and  the  van  der  Waals  contributions  to  the  desolvation 
free  energy  can  be  removed,  and  the  binding  free  energy  is  re¬ 
duced  to  the  smooth  components  described  by  (3).  Thus,  in  local 
minima  in  which  the  internal  and  van  der  Waals  terms  are  close 
to  zero,  the  free  energy  surface  is  essentially  determined  by  the 
“smooth”  free  energy  component  A Gs. 

B.  Implications  for  Global  Optimization  and  Our  Method 

Summarizing  the  discussion  thus  far  it  is  evident  that  the 
free  energy  function  in  (2)  has  a  huge  number  of  local  minima 
caused  by  high-frequency  terms;  primarily  the  van  der  Waals 
term.  Even  if  one  ignores  changes  in  the  interface  side-chains, 
local  minimization  in  SE{ 3)  [2]  enhanced  with  multistart  would 
have  to  be  quite  lucky  to  approach  a  global  minimum  in  some 
reasonable  time.  Generic  global  optimization  approaches  (e.g., 
those  supported  by  GAMS  [13])  typically  require  some  regu¬ 
larity  conditions  (e.g.,  twice  continuously  differentiable  func¬ 
tions)  and  are  not  designed  for  minimizing  complex  free  en¬ 
ergy  functions  having  a  huge  number  of  local  minima.  A  de¬ 
terministic  branch-and-bound  global  optimization  method,  the 
clBB  method  (see  [5]),  has  been  a  part  of  a  successful  pro¬ 
tein-folding  approach  [14]  but  it  can  be  costly  since  it  systemat¬ 
ically  explores  all  of  the  search  region.  The  same  will  be  true  for 
other  deterministic  approaches.  In  fact,  this  is  part  of  the  motiva¬ 
tion  of  the  hybrid  approach  by  Klepeis  et  al.  [15]  that  combine 
aBB  with  a  randomized  search  approach  using  ideas  from  ge¬ 
netic  algorithms.  To  the  best  of  our  knowledge,  the  only  system¬ 
atic  optimization  method  consistently  applied  in  protein-protein 
docking  is  simulated  annealing  [6].  The  CGU  method  [9],  which 
motivated  our  work,  has  also  been  used  in  protein  folding. 

Fortunately,  the  funnel-like  shape  of  the  energy  function  can 
provide  some  guidance.  Suppose  we  have  identified  (by  FFT  fil¬ 
tering  and  clustering)  a  number  of  conformations  in  the  native 
energy  funnel  and  we  seek  to  determine  the  native  conforma¬ 
tion.  The  low-frequency  energy  terms  [cf.  (3)]  form  a  smooth 
function  which  can  often  be  well  approximated  by  a  convex 
quadratic.  To  exploit  this  structure  we  work  on  the  envelope  sur¬ 
face  spanned  by  the  set  of  local  minima.  This  surface  inherits 
the  smooth  behavior  of  the  low-frequency  energy  terms.  More 
specifically,  we  generate  a  large  number  of  local  minima  and 
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construct  a  convex  quadratic  function  that  forms  the  tightest  un¬ 
derestimator  of  all  of  them.  This  quadratic  function  suggests  the 
location  of  the  deep  energy  valley  containing  the  native  struc¬ 
ture.  We  use  this  information  to  iteratively  refine  our  search  .  We 
note  that  this  approach  does  not  need  access  to  individual  energy 
terms;  it  only  requires  access  to  a  program  evaluating  AG. 

ID.  SDU:  SEMIDEFINITE  UNDEEESTIMATOR  METHOD 

SDU  consists  of  two  key  components:  i)  underestimating  the 
surface  spanned  by  the  local  minima  [see  Fig.  1  (right)],  and  if) 
using  the  underestimator  to  guide  further  exploration. 

A.  Constructing  an  Underestimator 

Let  us  denote  by  /  :  — *  R  the  free  energy  function  we 

seek  to  minimize  and  assume  we  have  obtained  a  set  of  K  local 
minima  $l ,  *  * . .  (j>K  of  /(■).  We  are  interested  in  constructing  a 
convex  quadratic  function  U($)  that  underestimates  /(■)  at  all 
local  minima  i  =  1,  *  *  * ,  K.  Let 

U{$)  =  +  W<f>  +  c 

where  Q  E  Rn,xn  is  positive  semidefinite  (hence,  £/(■)  is 
convex),  b  E  Un7  and  c  is  a  scalar. 

Using  an  LI  norm  as  a  distance  metric  the  problem  of  finding 
the  tightest  possible  such  underestimator  U  ( ► )  can  be  formulated 
as  follows: 

K 

min  y^(/(^)  —  c  —  Q<f>j  —  b'ft) 
j=t 

/(**)  >  c+4?rQ$  Lb =  1, . .  * ,  K,  Q  h  0  (4) 

where  the  decision  variables  are  Q,  b,  and  c,  and  ‘V  0”  denotes 
positive  semidefiniteness. 

Let  nowr  introduce  vectors  b  b-  >  0,  scalars  c+,c“  >  0 
satisfying  b  =  b“  —  b“  and  c  =  c+  —  c~  7  and  slack  variables 
s  =  (si, „ , , }  &k).  Define  the  block  diagonal  matrix 

Y  =  diag(Q:diag(b+),tliag(b_),e+,c_.diag(s)).  (5) 

Define  Fo  =  diag(diag(0),  — diag(e)),  F,  = 

diag , diag(^),  1,  -1, diag(ej)),  for 

all  j  =  X, . . .  ,  K7  where  0  is  the  (3n  +  2) -dimensional  zero 
vector,  e  is  the  A" -dimensional  vector  of  ones,  and  e ,  is  the  jth 
unit  vector.  Let  also  E*  j  denote  the  (3rc+ K+2)  x  {3n+ K + 2) 
matrix  with  all  elements  equal  to  zero  except  the  (i,j)th 
element  wrhich  is  equal  to  1.  Then,  (4)  can  be  written: 

(SDP-P)  max  Fo  •Y 

s.t.  F^.Y  =  /(**),  j  = 

Eiif-»Y  —0,  i  —  r?  -h  1 , . . .  >  j  =  1,  —  I 

Y  y  0  (6) 

where  the  decision  variable  is  the  matrix  Y.  Problem  { SDP-P } 
in  (6)  is  a  semidefinite  programming  ( SDP )  problem  (see  [16] 
or  [17]}.  SDP  problems  can  be  solved  efficiently  using  interior- 
point  methods  [16].  General  purpose  solvers  for  SDP  problems 
are  readily  available  (e.g.,  [18]). 


The  dual  to  (SDP-P)  in  (6)  is  the  problem 

K 

(LMI-D)  min  T2  x)fW ) 

j=i 

K  %n-\-K+2  i- 1 

«■*•*= +  E 

j==l  j=l 

Z±Q  (7) 

where  the  decision  variables  are  the  xfis  and  s.  Problem 
( L  MI-  D )  can  be  seen  as  the  problem  of  mini  mi  zi  ng  a  1  i  near  func  - 
tion  subject  to  a  linear  matrix  inequality  (LMI).  The  following 
theorem  summarizes  the  construction  of  the  underestimator. 

Theorem  III.  1:  Consider  a  Junction  f  :  Rn  — +  R  and  a  set 

of  local  minima  _ ,  $  h  of  /{• ) .  Let  ( Q ,  b+ ,  b^  ,  ,  c“ ,  s) 

form  an  optimal  solution  Y  of  Problem  (SDP-P)  in  (6), 
where  Y  is  defined  as  in  (5).  Let  U (f)  =  +  (b+  - 

fo~  Y<f>  +  (c+  —  c~).  Then  U(  )  satisfies  f(4>J)  >  for 

all  j  ~  1, . . . ,  K  while  minimizing  [|{/{<f>' ), ))  - 
(U(<f>L), . . . ,  t/(^Jl))|||.  Moreover,  the  dual  to  Problem 
(SDP-P)  is  the  LMI  problem  (LMI-D)  in  (7). 

Hereafter,  we  will  say  that  a  function  U(-)  satisfying  the 
statement  of  Theorem  ELI  underestimates  /(■)  at  points 
<f>]\  ...  .<j>h  .  Fig.  1  (Right)  illustrates  such  an  under  estimator. 


B.  Sampling 


Suppose  we  are  seeking  the  native  conformation  in  some  re¬ 
gion  B.  Using  a  set  of  local  minima  $  t . , , ,  E  B  of  /(■)  we 
construct  an  under  estimator  [/(/)  as  described  in  Section  IILA. 
Depending  on  the  samples  we  used,  and  assuming  that  the  con¬ 
structed  underestimator  reflects  the  general  structure  of  /( ■■),.  the 
global  minimum  of  U ( 1 ),  say  $r ,  is  in  the  vicinity  of  the  global 
minimum  of  /(*).  We  will  be  referring  to  ^  as  the  predictive 
conformation. 

Clearly,  the  distance  of  the  predictive  conformation  from  the 
native  one  depends  on  whether  the  underestimator  is  constructed 
using  a  sufficiently  rich  set  of  local  minima.  However,  even  if  that 
is  the  case,  locating  the  global  minimum  is  difficult  since  van  der 
Waals  interactions  create  a  large  enough  number  of  local  minima 
around  the  native  conformation.  Our  strategy  is  to  sample  in  the 
area  around  <f>1  such  that  conformations  close  to  are  more 
likely  to  be  selected,  in  addition,  conformal  ions  with  high  enough 
energy  can  safely  be  ignored.  This  can  be  achieved  by  using  the 
following  probability  density  function  (pdf)  in  B\ 


,  f/W  -  t/mMC  ^  U  W  -  Urn,, 

3(9  ~  tW)  d<t>  ~  ^ 


€B 


where  C7max  ^  maxg  U (fi)  and  A  denotes  the  integral  in  the 
denominator. 

To  generate  random  samples  in  B  using  the  above  pdf  we  will 
use  the  so  called  acceptance/ rejection  method.  Let  h(fi)  —  1  jV 
be  the  uniform  pdf  in  B  where  V  =  fg  dtfi.  Notice  that 


9#)  < 


v(U(4>f)  -  tW) 


B 


and  set  equal  to  the  ratio  of  the  left  hand  side  over  the 

right-hand  side  of  the  above,  i.e., 


U(4>)  -  Unm 
U($p)  -  (U 
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1)  Generate  uniformly  distributed  random  variables  Xi  E 

and  xa  E  [£,  1]. 

2)  If  <  7?(x|),  stop  and  output  d>  —  x^;  otherwise,  return 
to  Step  I. 


Fig,  2,  Algorithm  generating  a  sample  in  B  drawn  from  g(-)  with  associated 
density  in  [£,  1], 

In  order  to  discard  high  energy  conformations  we  are  interested 
in  sampling  points  in  B  with  associated  probability  density  in 
some  interval  [£*1],  where  £  E  [0,  I).  The  algorithm  in  Fig.  2 
outputs  such  a  sample  point.  To  see  that  notice  that  in  Step  1  we 
generate  uniformly  distributed  samples  in  the  set  {(x:y)  |  x  E 

B,  y  E  [(,  1]}.  The  rule  of  Step  2  accepts  uniformly  distributed 
samples  in  {(x,  y)  |  x  6  B,  £  <  y  <  g(x) }.  Thus,  the  output  0 
of  the  algorithm  is  distributed  in  B  according  to  <y(  ). 

We  finally  note  that  the  algorithm  in  Fig.  2  requires  knowing 
f/max .  In  many  cases,  this  is  straightforward  to  obtain.  Consider 
for  instance  the  case  where  B  is  a  polyhedron;  one  special  case 
of  practical  interest  is  when  if?  is  a  box  set  of  the  type  {x  |  < 

Xi  <  v-i ,  i  —  !....,  7i}_  Then,  since  U(-)  is  convex  it  achieves 
its  maximum  at  an  extreme  point  of  B.  Alternatively,  one  can 
use  an  estimate  of  I7inaXj  e.g.,  max^  U {()>  ). 

C.  The  SDU  Algorithm 

We  now  have  all  the  ingredients  to  present  our  SDU  algo¬ 
rithm.  The  algorithm  seeks  a  global  minimum  of  /(♦)  in  some 
region  B  of  the  conformational  space;  it  is  presented  in  Fig.  3. 
Throughout  the  algorithm,  we  maintain  a  set  C  of  interesting 
local  minima  obtained  so  far  including  the  best  such  local  min¬ 
imum  denoted  by  <f>G .  The  evolution  SDU  depends  on  the  pa¬ 
rameters  K,  £  E  [0, 1]T  m  and  e7  and  on  the  way  we  define  the 
neighborhood  )  in  Step  5.  These  will  be  appropriately 

tuned  in  every  problem  instance. 

A  couple  of  remarks  on  the  proposed  SDU  algorithm  are  in 
order.  The  algorithm  combines  exploration  with  focalization  in 
energy  favorable  regions  of  the  conformational  space  (energy 
funnels).  The  exploration  is  in  fact  biased  towards  these  energy 
favorable  funnels.  This  is  intended  to  avoid  computationally  ex¬ 
pensive  exploration  in  areas  of  the  conformational  space  that  are 
not  likely  to  contain  the  native  structure. 

We  should  point  out  that  we  make  no  claims  that  SDU  con¬ 
verges  to  the  global  minimum  of  /(’).  In  fact,  it  is  straight¬ 
forward  to  see  that  it  will  not  find  the  global  minimum  if  we 
do  not  use  enough  local  minima  to  determine  the  underesti¬ 
mator  or  when  /{-)  is  arbitrary  and  does  not  have  a  funnel-like 
shape.  However,  later  in  the  paper  we  will  provide  arguments 
that  (probabilistically)  guarantee  convergence  for  funnel-like 
functions  under  a  suitable  set  of  conditions. 

IV.  The  Importance  of  the  Underestimator's  Structure 

Next,  we  review  some  earlier  work,  study  its  limitations,  and 
assess  the  relationships  with  SDU.  Let  us  consider  the  SDU  al¬ 
gorithm  under  the  following  key  modifications. 

1)  Underestimation:  In  deriving  £/(■)  impose  the  constraint 
that  Q  is  a  diagonal  positive  semidefinite  matrix.  Then  the 
semidefinite  constraint  can  be  replaced  by  a  non -negativity 
constraint  for  all  diagonal  entries.  It  follows  that  Problem 


n 

1)  Initialization:  Starting  tan  K  random  points  in  BB 

perform  local  minimization  to  obtain  K  (K  >  (jH- 1  )(n+2)/2 ) 
distinct  local  minima  of  /(■).  Set  .if  — 

{&1  .  .  .  ,(f>h  }  and  <yc  =  arginin^^ . ^  /(>e). 

2)  Underestimation:  Solve  Problem  (SDP-P)  in  (6)  to  ob¬ 

tain  the  underestimator  U{(j?).  Set  the  predictive  confor¬ 
mation  equal  to  a  minimizer  of  (/(-)•  that  is.  when  Q  is 
invertible  *b. 

3)  Elimination;  Discard  unfavorable  local  minima  from  if; 
more  specifically,  set  J?  :=  Jaf  \  {<£  e  \  R(4>)  < 

£  and  <f>  f=  <fiCl )  - 

4)  Focalization:  Define  a  neighborhood  ,/U(rpr)  C  ‘M  of 

set  m  \=  jY(<f>py 

5)  Exploration: 

a)  Start  from  and  use  local  minimization  to  obtain 

■■■  }  f-  P 

a  local  minimum  4>  of  /(■),  If  4>  C  3$  and 
0P  £  .if,  set  Jg  :=  if  U  {4 >P}  and  4>1t  := 

b)  Obtain  m  samples  from  the  sampling  algorithm  of 

Fig.  2.  Using  these  samples  as  starling  points  perform 
local  minimization  to  obtain  m  distinct  local  minima 
xVv*,xm  of  /(.).  Set  :=  JSf  U  {x  |  x  = 
x1 . ,  xTn ,  x  E  £  if}  and 

4>G  “  arg  min  f(4>). 

0-^.X1 . Km 

6)  Termination:  If  ||0G  -  <ph'  |  <  €  or  if  there  is  no 
progress  in  reducing  f{4>G)  over  several  iterations  then 
stop;  otherwise  go  to  Step  2, 


Fig.  3.  SDU  algorithm. 

(SDP-P)  can  be  reformulated  as  a  linear  programming 
problem  ( LP)  which  can  be  easily  solved. 

2)  Sampling:  Replace  our  biased  sampling  with  random  (uni¬ 
form)  sampling  in  the  neighborhood  A of  <f>P . 

The  resulting  method  is  the  so  called  convex  global  underes¬ 
timator  (CGU)  method  by  Phillips  et  ah  [9]  for  minimizing 
funnel-like  free  energy  functions  in  molecular  modeling  prob¬ 
lems.  We  will  argue  that  these  two  differences  with  our  proposed 
SDU  drastically  affect  the  performance  of  the  corresponding 
optimization  method.  In  particular,  limiting  the  underestimator 
search  to  the  class  of  canonical  parabolas  substantially  reduces 
the  efficiency  and  accuracy  of  CGU  for  general  problems  where 
the  surface  spanned  by  the  local  minima  is  not  typically  aligned 
with  the  canonical  coordinates  defining  the  underestimating 
parabola.  Dennis  and  Vajda  [7]  report  many  such  cases  where 
CGU  performs  poorly.  Some  attempts  in  addressing  this  lim¬ 
itation  have  been  made  by  Rosen  and  Marcia  [19]  but  again 
handling  special  cases. 

A.  Comparing  the  CGU  and  SDU  Underestimating 
Approaches 

As  we  discussed  in  Section  III,  a  quadratic  underestimator 
will  not  be  informative  if  either  i)  /(■}  is  not  funnel-shaped  and 
the  envelope  of  local  minima  can  not  be  well  approximated  by 
a  convex  quadratic,  or  it)  if  we  do  not  use  a  rich  enough  set 
of  local  minima  in  constructing  £/(*).  In  the  following  we  wish 
to  remove  these  two  potential  sources  of  poor  performance  in 
order  to  better  assess  the  underestimating  power  of  CGU  and 
SDU.  More  specifically,  we  consider  the  “ideal”  case  of  under¬ 
estimating  a  convex  quadratic  given  by  /( t)  —  t'Qt  +  b't  +  c7 
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where  Q  y  0,  We  assume  that  0  an  infinite  number  of  sample 
points  of  /(-)  in  some  compact  sampling  region  B  is  at  our  dis¬ 
posal  when  we  construct  the  underestimator,  and  ii)  /(■)  is  con¬ 
tinuous  on  B.  The  construction  of  the  underestimator  based  on 
utilizing  ail  sample  points  in  B  can  be  formulated  as  the  fol¬ 
lowing  (infinite-dimensional)  optimization  problem: 

min  /  (/( t)  —  Lr(t))  dt  s,L  /( t)  >  J7(t),  t  e  B 

JtGB 

(B) 

where  the  decision  variables  are  the  (yet  unspecified)  parameters 
defining  t/(t). 

Suppose  first  that  we  use  the  SDU  underestimating  approach 
and  seek  to  construct  a  function  U(  t)  =  t'Qt  +  b't  +  c*  where 
Q  t  0.  Consider  the  problem  in  (8)  for  such  a  U(t).  The  next 
proposition  is  immediate. 

Proposition  IV.  I:  SDU  can  underestimate  /(  ■  )  exactly,  in 
particular,  (Q.  b.  c)  =  ( Q .  b .  c)  is  an  optimal  solution  of  (8). 

We  next  consider  the  CGU  underestimation  approach.  Specif¬ 
ically,  we  seek  to  construct  a  function  t/(t)  —  t  Dt  +  b't  +  c, 
where  D  =  diag(di,  * . . ,  du  )  and  dt  >  0  for  i  —  1,  *  *  *  *  n. 
Consider  the  problem  in  (8)  for  such  a  U[ t).  For  simplicity 
of  exposition  let  us  assume  that  the  search  region  B  is  the  hy¬ 
percube  defined  as  B  —  B\  X  ♦  *  ■  x  where  £5*  —  [/$ ,  %;] 

and  u}  —  lt  —  T  for  all  i  —  1,,,,  ,n.  We  denote  a(t)  = 


and  z  =  (di , , , . .  dn,  b\ , . , . ,  bn,c).  Then,  after  some  algebra, 
(8)  becomes  equivalent  to 


(LSIP-P)  max  h  z  s.t.  a'(t)z  <  /( t),  t  G  B 

(9) 

where  z  is  the  decision  vector.  Note  that  (LSIP-P)  involves  an 
infinite  number  of  constraints.  A  problem  with  such  a  struc¬ 
ture  is  known  as  the  linear  semi -infinite  programming  (LSIP) 
problem  (see  [20]}.  Its  dual  can  be  formulated  in  measure  space 
as  follows: 

(LSIP-D)  min  [  f(t)dfi(t) 

Jb 

s.t  /  B(t)dp(t)  =  p  £  M+(B)  (10) 

Jb 

where  M+(B)  denotes  the  set  of  non-negative  regular  Borel 
measures  on  B. 

I )  Approximate  Problem  Converges  to  LSIP:  Next  we  ex¬ 
plore  the  relationship  of  the  underestimator  obtained  by  solving 
(LSIP-P)  in  (9)  to  the  underestimator  obtained  as  in  Sec¬ 
tion  III,  namely,  by  using  the  function  values  /(t1 ,  /( t ) 
at  a  set  of  samples  t1, , . . ,  t h  from  B . 

We  introduce  some  additional  terminology  and  notation. 
Let  Afgn+i  be  the  cone  generated  by  a(t)  for  t  G  15,  i.e., 
M2n+ 1  —  jw  |  w  -  a(t)d/i(t),/i  G  Af+(B)}.  We  denote 

by  int(M2n+i)  the  interior  of  We  also  denote  by 

L>(h,  0)  the  level  sets  of  the  (primal)  problem  (LSIP-P),  in 


particular,  L£(h,0)  =  {z  |  a'(t)z  <  /(t),Vt  G  S;h'z  >  #}. 

We  will  say  Fhat  the  primal  (LSIP-P)  is  consistent  if  it  has  a 
feasible  solution.  We  will  say  that  (  LSIP-P)  is  superconsistent 
if  the  interior  of  the  feasible  set  is  non-empty,  that  is,  if  there 
exist  z*  such  that  ay (t)z*  <  /( t)  for  all  t  €  B  (this  is  Slater’s 
condition).  Similarly,  we  will  say  that  the  dual  (LSIP-D)  is 
consistent  if  h  G  Mzn+il  the  dual  will  be  termed  superconsis¬ 
tent  ]fh  E  int(Man+i). 

Lemma  IV.2 :  Assume  that  (LSIP-P)  is  consistent.  Then, 
( LSIP-  D )  is  superconsistent  and  all  primal  level  sets  L  J  ( h ,  B } 
are  bounded. 

Proof:  Recall  that  we  have  assumed  B  to  be  compact  and 
/(-)  continuous  on  B.  We  will  first  show  that  (LSIP-D)  is  su¬ 
perconsistent.  To  that  end,  note  that  the  uniform  measure  in  B 
is  a  feasible  solution  of  (LSIP-D),  which  implies  that  h  G 
M2n+i-  Furthermore,  the  boundary  of  M2ti+i  consists  of  vec¬ 
tors  a(t*)  for  some  t*  E  B.  Observe  that  h  is  not  on  the 
boundary,  i.e.,  h  G  int(M2rt+i)  which  establishes  the  super¬ 
consistency  of  (LSIP-D),  Using  a  result  from  [20,  Th.  6.11] 
the  boundedness  of  all  primal  level  sets  follows.  ■ 

Next  consider  an  approximate  finite  problem  constructed 
from  (LSIP-P)  by  only  enforcing  a  finite  number  of  con¬ 
straints,  say  for  tL, , , . ,  tA  G  5,  More  generally,  let  B  C  B  be 
a  finite  subset  of  B  and  form  the  LP 


(LF-P)  max  h'a  s.t.  a'(t)z  <  /( t)t  t  G  B  (11) 


which  can  be  seen  as  a  finite  approximation  of  (LSIP-P).  De¬ 
fine  d,(B)  ~  maxtGg  min^-g  ||t  —  t||  as  a  metric  of  density  of 
B  in  B.  Then,  the  boundedness  of  all  (LSIP-P)  level  sets  suf¬ 
fices  to  guarantee  the  convergence  of  the  (LP-P)  solutions  to 
the  (LSIP-P)  solutions.  The  result,  which  is  from  [20],  is  pro¬ 
vided  in  the  following  theorem. 

Theorem  IV. 3:  For  every  e_  >  (}  there  exists  Se  >  0  such 
that  for  all  B  C  B  with  d(B)  <  (%  it  is  the  case  that  (LP-P) 
is  solvable,  and  for  every  solution  z  of  ( LP-Pj  there  exists  a 
solution  z*  of  (LSIP-P)  such  that  ||z  —  z*||  <  e. 

The  result  is  insightful  because  it  suggests  that  when  we  use 
enough  samples  the  quality  of  the  CGU  underestimator  does 
not  depend  on  sample  selection  but  rather  on  the  fundamental 
structure  of  the  underestimator  function. 

2)  Limitations  of  the  CGU  Underestimator:  We  now  have 
the  necessary  machinery  to  analyze  the  CGU  underestimator.  In 
particular,  we  consider  a  class  of  functions  to  be  underestimated 
and  demonstrate  how  the  CGU  underestimator  fails. 

Suppose  we  wish  to  underestimate  /( t)  —  t'Qt  4-b't,  where 


Q  =  diag 


£*1 

0 


(12) 


n  >  2,  >0  for  all  i,  0  <  0,  and  ctiCV2  —  02  >  0  to 

guarantee  Q  y  0.  This  choice  of  Q  can  be  seen  as  the  sim¬ 
plest  Q  with  off-diagonal  elements;  the  argument  we  present 
next  can  be  generalized  to  more  general  cases  as  well.  Let  the 
underestimator  be  given  by  t/(t)  =  PDt  +  b't  +  c,  where 
D  =  diag(di , . . . ,  dn)  >:  0,  b  =  (fq,  — ,  bn)  and  c  G  R.  The 
CGU  underestimator  is  informative  if  it  manages  to  locate  the 
minimum  of  /( t).  In  particular,  we  would  like  the  minimi zer  of 
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{/(t)  to  coincide  with  the  minimizer  of  /( t).  We  will  show  by 
contradiction  that  this  is  not  possible  in  general. 

Consider  an  instance  of  the  underestimation  problem  at  hand 
where  B  is  the  hypercube  defined  as  S  =  Si  x  ■  *  -  x  Bn  where 
all  Bi's  are  identical  and  given  by  Bt  =  [— 2T/3,T/3]  for  all 
i  =  1,  * . . ,  n  and  T  >  1.  (LSIP-D)  takes  the  form 


Next,  fix  some  j  £  {3, * , , ,  n}  and  some  k  >  3 fT  and  apply 
(14)  at  t  such  that  f  j  =  —  1  /k,U  —  l/k  for  aU  l  ^  j  and  at  t 
such  that  tj  —  l/k,  t[  —  —l/k  for  all  /  f  j.  These,  along  with 
(17),  c  —  0,  and  (16),  imply 

b\  4-  62  =  &i  +  ^2  bj  —  j  =  3, . « * ,  n.  (18) 


(LSIP-D)  min  ^  a  J2  +  ^  /A  +  2/TEjAA] 

i=l  i  =  l 

s.t.  /  a(t)d//(t)  —  h,  fi  £  M+(B)  (13) 

Jb 

where  [*]  denotes  expectation  with  respect  to  fi,  I2  = 
(1  /T)JBit2dt  =  T2/9,  and/i  =  (1  /T)  fBtdt  =  -T/6. 
Since  3  <  0  the  dual  problem  is  equivalent  to  the  problem  of 
selecting  a  measure  fi  in  order  to  maximize  E^jtA]  subject  to 
the  constraint  that  the  first  and  second  moments  of  the  marginal 
distributions  under  p  match  those  of  the  uniform  distribution 
(cf.  definition  of  h  in  Section  IV-A).  However,  for  any  measure 
f.i  6  M+(£)  the  expectation  E^[(£i  —  t2)2]  is  nonnegative, 
hence 

2E/4 [^1^2]  S  =  2/5. 


Now,  for  any  j  £  |3f . . , ,  n}  and  some  k  >  3/7T  apply  (14)  at 
t  such  that  tj  =  \/2 jk,  ii  =  l/k  for  all  /  ^  j  and  at  t  such  that 
tj  —  0,  ti  —  1  jk  for  all  l  ^  j.  These  inequalities,  along  with 
(17),  c  =  0,  and  (16),  lead  to 

d\  +  d2  =  oi  +  a*  +  2ft  dj  —  aj ,  j  —  3, .  *  * ,  n ,  (19) 

To  proceed  consider  z  with  c  =  0  satisfying  (18)  and  (19). 
Let  x  =  d\  —  a\  —  ft  and  y  —  h\  —  bi*  We  will  show  that 
(x7y)  =  (0,0)-  Assume  otherwise.  Using  (18)  and  (19)  the 
primal  feasibility  condition  (14)  reduces  to 

{h  ~  £2 )[/?(*!  ~  £2)  -\-x(h  +*2)  +  3/]  <  0,  Wi,  t‘2  €  Bi*  (20) 

Let  0  <  6  <  Tj 3  and  apply  the  above  at  (/,tl  J2)  =  (5,  —5)  and 
(/i,<2)  =  (S,  6).  It  follows 

2/36  <y<  - 2/36 . 


It  follows  that  at  optimality  the  above  inequality  holds  with 
equality,  that  is,  ti  and  1%  are  col  inear. 

Next,  let  V^LsiP-P)  and  ^(lsip-D)  denote  the  optimal 
values  of  (LSIP-P)  and  (LSIP-D),  respectively.  We  have 
already  shown  (cf.  Lemma  IV. 2  )  that  (LSIP-D)  is  super- 
consistent.  Moreover,  the  argument  in  the  previous  paragraph 
suggests  that  VjxsiP-D)  is  finite.  Consequently,  (LSIP-P) 
has  an  optimal  solution  and  V(  lsip-P)  =  V/lsip-d)  (cf.  [20, 
Th.  6.9]).  In  the  terminology  of  [20]  we  have  perfect  duality. 
The  necessary  and  sufficient  conditions  for  optimality  of  a 
primal-dual  pair  are  primal  feasibility,  dual  feasibility,  and 
perfect  duality. 

Let  (z,  ft)  be  an  optimal  primal-dual  pair,  where  z  = 
(di7 ...  ,  hn,  c)7  and  ft  is  a  measure  in  M+(B)  with 

the  properties  identified  previously.  The  optimality  conditions 
are 


^  ^  t;  dj  +  N  ^  tibj  +  c  ^  ^  oti  -\-2fttit2  +  ^  "fj  bi ,  Vt  £  B 

i=l  i=L  i  =  l 

(14) 

/  a(t)tf/i(t)  =  h,  fi  e  M+(B)  (15) 

Jb 

n  n  n  n 

J2  hdi.  +  Y.hk+c  =  53  a(/ 2  +  53  hk  +  2 0I2.  (16) 


If  y  0,  by  selecting  6  close  enough  to  zero  we  can  violate 
primal  feasibility.  Therefore,  y  =  0.  Further,  let  1  <  k  <  7  and 
apply  (20) at  (t-i,t2)  =  (fe/3, 1  /3)  and  (i,  ,t2)  =  (1/3;  ft/3).  It 
follows: 


fc  - 1 
k  H~  1 


<  x  <  ft 


1  -  k 

k  +  1 


If  x  ^  0,  by  selecting  fc  dose  enough  to  one  we  can  violate 
primal  feasibility.  Therefore,  x  —  0.  In  summary,  we  have 


di  =  q;  +  ft,  i  =  1, 2  dj  -  aj ,  j  -  3, .  •  * ,  n 
bj  =Sj,  j  =  1,-  - 1  n  c  -0.  (21) 


The  minimum  of  the  CGU  underestimator,  say  t,  is  given  by 
t  =  -(l/2)D_1b  which  due  to  (Section  IV-A.II)  yields 


U  -  - 


2  (cu+ff)' 


=  1.2  =  =  3 . 


However,  the  minimum  of  /(■),  say  t*s  is  given  by 

1 


*1=  - 


2(a1a2-Xj{a2bl-^) 

(0:162-^1),  —  3  —  3)  • 


2  2(tiiO;2  —  X) 


2a  j 


i- 1  i=  1  1=1  i-l 

Consider  now  the  primal  feasibility  condition  (14)  at  t  —  0 
and  at  t  =  —  2Te/3.  These  two  inequalities  along  with  (16) 
imply  c  ^  0.  Furthennore,  (14)  at  t  =  —  Te/3  and  at  t  =  Te/3 
along  with  (16)  yield 

ft 

53&-5i)=o.  ^17> 

i= 1 


Clearly,  t  is  different  from  t*  in  general  and  the  CGU  underes- 
timator  can  not  locate  the  minimum  of  /( ■).  We  summarize  the 
result  of  this  subsection  in  the  following  theorem. 

Theorem  TV. 4:  Let  /( t)  =  t/Qt  -j-  b't,  where  Q  0  is 
given  in  ( 12).  Further,  let  U(t)  =  t'Dt  +  b't +c  be  the  function 
formed  from  the  optimal  solution  to  (LSIP-P),  i.e.,  the  optimal 
CGU  underestimator  to  /( t).  Then,  in  general,  the  minimizer  of 
the  underestimator  is  different  from  the  minimizer  off  (t). 
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The  result  of  Thm.  IV.4  is  significant  since  it  establishes  that 
the  CGU  underestimator  cannot  locate  the  minimum  of  /(■) 
even  if  it  uses  an  infinite  number  of  samples.  In  contrast,  ac¬ 
cording  to  Proposition  IV.  1  and  Theorem  IV. 3,  SDU  will  find 
the  minimum  of  /(-)  as  the  number  of  samples  grows  to  infinity. 

V.  On  SDU’s  Convergence 

In  this  section,  we  provide  a  rough  analysis  of  the  SDU  algo- 
rithm  and  show  that  under  appropriate  conditions  it  converges 
to  the  global  minimum  of  /(■).  Since  the  free  energy  functions 
we  seek  to  minimize  are  funnel-like,  for  the  remainder  of  this 
section  we  will  impose  a  set  of  structural  assumptions  on  /(■) 
and  the  search  region  B  that  reflect  this  structure.  We  denote 
by  epi (/)  the  epigraph  of  /(■)  which  is  defined  as  epi (/)  — 
{  |  <j>  E  B,  f($)  <  u;}.  We  also  denote  by  conv{<S)  the 

convex  hull  of  any  set  5. 

Assumption  A:  Assume  that  /(<£)  satisfies  the  following  set 
of  conditions:  i)  it  is  continuously  differentiable;  ii)  f(f)  has 
a  unique  global  minimum  in  13;  Hi)  B  is  compact;  iv)  for  all 
local  minima  of  /(■)  there  exists  an  open  set  such  that  <j>  is  the 
unique  minimize r  of  f  {  ■  )  within  this  set;  (iv)  the  extreme  points 
c?/conv(epi(/))  lie  on  a  quadratic  function  =  $  Q<j>  + 
b7^+c;  v)  U (4>)  has  a  unique  global  minimum  which  is  identical 
with  the  global  minimum  of  /(-)  in  B. 

For  functions  that  satisfy  Assumption  A  we  will  say  that  they 
have  a  funnel-like  shape  [see  Fig.  1  (Right)].  The  next  proposi¬ 
tion  provides  a  sufficient  condition  for  exactly  determining  the 
quadratic  function  U ( ■ )  which  tightly  underestimates  all  local 
minima  of  /(■). 

Propos  it  ion  V.  1 :  he  t  A  ss  umption  A  prevail.  Con  sider  th  e  SD  U 
algorithm  of  Fig.  3  and  suppose  that  in  Step  I  we  obtain  at  least 
K  —  ((n  T  l)(n  -f  2))/2  distinct  local  minima  7 , . . ,  <f>h  of 
/(■)  which  are  extreme  points  of  coov(epi(/)).  Then ;  the  under- 
estimator  U(*)  obtained  in  Step  2  is  identical  to  !/(-). 

Proof:  Constructing  the  underestimator  U {♦)  is  equivalent 
to  solving  problem  (4).  Clearly,  (Q.  b,  c)  =  (Q.  b ,  c)  is  an  op¬ 
timal  solution  because  it  is  feasible  and  achieves  a  zero  cost.  At 
local  minima  (/>J  we  have  Uififi )  —  )  for  all  j  —  i, ....  K 

which  form  a  linear  system  of  equations.  This  linear  system 
has  { n 2  +  n)j 2  unknown  elements  of  Q  plus  n  unknown  el¬ 
ements  of  b  plus  the  unknown  value  of  c.  It  follows  that  for 
K  >  ((tj  -h  l){n  +  2))/2  we  obtain  the  exact  quadratic  func¬ 
tion  which  passes  from  all  extreme  points  of  conv(epi(/)).  ■ 

The  following  theorem  establishes  that  given  sufficient  sam¬ 
pling  of  the  search  region  B  the  SDU  underestimation  procedure 
can  locate  the  global  minimum  of  /(■)  which  we  denote  by  <j>  . 

Theorem  V.2:  Let  Assumption  A  prevail.  Consider  the 
SDU  algorithm  of  Fig.  3  and  assume  that  8  contains  at  least 
({n  +  1)(ti  +  2)}/2  distinct  local  minima  of  /(-)  which  are 
extreme  points  qfconv(epi(/)).  Suppose  that  in  Step  1  of  the  al¬ 
gorithm  we  obtain  K  local  minima  01, _ ,  (f>K  of  f{  *  )  starting 

from  uniformly  distributed  starting  points  in  B.  Then T  the  global 
minimum  of  the  underestimator  U(-)  obtained  in  Step  2  of 
the  algorithm  converges  in  probability  to  the  global  minimum 
<p*  of  f(- )  as  K  —*  oo,  namely)  limK->oo  P[^  —  <£*]  =  1- 

Proof:  Due  to  Assumption  A  function  /(■)  satisfies 
all  the  conditions  of  the  capture  theorem  [21,  Prop.  1.2.5]. 
In  particular,  for  every  local  minimum  of  /{♦)  which 
is  an  extreme  point  of  conv(epi(/))  there  exists  an  open 
set  Sj  =  {<j>  I  110  -  4>'\\  <  e)  such  that  any  convergent 


local  minimization  algorithm  initialized  with  any  point  in 
Sj  converges  to  $  3  We  assumed  that  B  contains  at  least 
M  —  ((n  H-  l)(rt  +  2)  )/2  distinct  local  minima  of  /(-)  which 
are  extreme  points  of  couv(epi(/)).  Without  loss  of  generality 
let  01,...,0U  £  B  and  set  Pj  =  (/*es.ne  #)/(J*eB 
for  j  —  1, . . . ,  M.  Note  that  there  exists  an  t  >  0  such  that 
p  j  >  e  for  all  jf.  By  virtue  of  Proposition  V.l  we  need  at  least 
M  =  ((n  +  l)(n  +  2))/2  distinct  local  minima  of  /(■)  which 
are  extreme  points  of  eonv(epi(/))  in  order  to  exactly  construct 
U($);  this  suffices  to  yield  <f>* .  Let  P  [success]  the  probability 
that  we  do  indeed  obtain  M  distinct  local  minima  of /(■)  which 
are  extreme  points  of  conv(epi(/)}  by  randomly  sampling  K 
points  in  B  and  performing  local  minimization  starting  from 
every  sample.  Let  also  Af  the  event  that  such  random  sampling 
yields  y  j  =  1. . . . ,  M,  at  least  once.  For  any  event  A  we 
denote  by  ^4  its  complement.  We  have 

P [success]  >  P [.4,i  n  ■  ■  ■  n  Am]  —  1  —  P[A\  U  ■  -  U  34a7| 
at  Af 

>  1  -  £  pra  =  1  -  E^1  -  of-  (22) 

3= 1  3=1 

The  first  inequality  shown  previously  is  due  to  the  fact  that  on 
the  right-hand  side  we  consider  only  local  minima  <f>1 ,  *  , . ,  ; 

B  may  contain  additional  local  minima  which  are  extreme  points 
of  conv(epi(/)).  The  second  inequality  above  uses  the  union 
bound.  Next  observe  that  since  the  p}  are  bounded  away  from 
zero  the  right  hand  side  of  (22)  converges  to  one  as  A*  — >  oo.  ■ 

We  can  think  of  Theorem  V.2  as  a  simple  sanity  check  of 
the  SDU  underestimation  approach.  It  implies  that  given  an  ap¬ 
propriate  funnel- like  structure,  SDU  can  locate  the  global  min¬ 
imum  of  the  free  energy  function  with  probability  approaching 
one  as  the  number  of  random  samples  in  B  increases.  Actual 
free  energy  functions  are  expected  to  deviate  from  the  structure 
put  forth  in  Assumption  A.  In  particular,  the  local  minima  of 
/(*)  may  not  be  tightly  underestimated  by  a  quadratic  function 
and/or  the  global  minimum  of  a  quadratic  underestimator  may 
not  coincide  with  4?  ■  These  factors  have  motivated  the  form  of 
the  SDU  given  in  Fig.  3. 

VI.  Numerical  Results  for  a  Set  of  Test  Functions 

In  this  section,  we  apply  SDLI  to  test  functions  that  were  se¬ 
lected  to  resemble  free  energy  functions.  Throughout  we  com¬ 
pare  SDU  with  CGU.2  Here,  and  for  all  numerical  results  re¬ 
ported  in  this  paper,  we  have  used  the  SDPA  [IB]  semidefinite 
programming  solver. 

The  test  functions  we  use  are  from  [7]  and  are  given  by 

/W=/iW  +  /aW+/sW 

n/2  n 

h(<t>)  =  X)(a02i-1  +  hit)  =  5^A|sm(702i)| 

i=  1  i= 1 

nf  2 

h{4>)  =  y^H?cos(t%0&-i)  -  d  oos{^02*)]  (23) 

'i—  1 

Tor  simplicity  of  the  exposition  we  assume  that  we  can  determine  exactly. 
Local  minimization  algorithms  can  approach  arbitrarily  closely.  Using  con¬ 
tinuity  arguments  a  similar  proof  can  establish  that  &F  converges  in  probability 
to  an  arbitrarily  small  neighborhood  of  as  K  — ¥  oo. 

2We  followed  the  original  CGU  algorithm  as  described  in  [9]  and  imple¬ 
mented  in  code  made  available  by  the  authors  of  [9]. 
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Fig.  4.  SDU  derives  tighter  underesttniators. 


where  $  =  (<£i,  * , . ,  <j>n)7  a  —  1,  b  =  2T  c  =  3,  d  =  4,  a  = 
3?r,  /3  —  4tt,  and  7  —  Note  that  the  origin  is  the  global 

minimum  of  /(■).  The  parameter  .4  determines  the  amplitude 
of  the  high  frequency  term  /3 (  ) ,  thus,  the  larger  A  the  more 
“noisy”  and  harder  to  optimize  we  expect  /  (  )  to  be. 

First,  and  to  illustrate  once  again  the  flexibility  that  a  general 
quadratic  underestimator  adds,  we  consider  a  version  of  the  test 
function  in  (23)  for  n  —  2  and  rotate  the  coordinate  system  with 
respect  to  the  two  axis  by  45°  .3  Fig.  4  (top)  depicts  the  underesti¬ 
mators  (along  a  certain  vertical  hyperplane)  obtained  by  CGU  and 
SDU,  denoted  by  U cg  u  ( * )  and  Usdu  ( T )  respectively,  in  their  first 
iteration  based  on  20  (=  Sn  +  4  for  n  =  2  as  required  by  CGU) 
uniformly  sampled  local  minima.  Ucgu($)  has  the  global  min¬ 
imum  at  (0.5225  ,  -07238)  and  USdu(4>)  at  (0.2669 , 0.4581), 
which  is  closer  to  the  true  global  minimum  (0,  0).  The  bottom 
figure  in  Fig.  4  depicts  the  contours  of  the  two  underestimators 
illustrating  that;  they  predict  rather  different  global  minima. 

Our  next  example,  concerns  the  six-dimensional  version  of 
the  test  function  in  (23).  This  is  in  fact  the  appropriate  space  for 
rigid  docking  problems  (translation  and  rotation).  We  applied 
both  CGU  and  SDU  and  report  various  indicators  of  their  per¬ 
formance.  For  both  cases  the  search  region  B  was  selected  as  the 
box  \<j>  |  —  1 00  <  <  10,  i  =  1 . , , ,  n}  so  that  the  global 

minimum  at  the  origin  is  not  at  the  center  of  B .  We  used  8n  -F  4 
initial  samples  in  B  as  required  by  CGU  (which  is  more  than  what 
Proposition  V.  1  suggests  for  SDU  when  n  =  6).  We  use  the  fol¬ 
lowing  notation:  i)  4*1)  denotes  the  initial  predictive  conformation, 

that  is,  the  minimum  of  the  initial  underestimator  we  obtain  (Step 

p 

2  of  SDU);  ii)  0O  denotes  the  local  minimum  of  /(*)  obtained 

BThat  is,  we  minimize  /( R  'p)  where  R  is  the  product  of  axis  rotation 
matrices,  one  for  each  axis. 


TABLE  1 

CGU  and  SDU  Applied  to  the  Test  Function  in  (23)  When  n  =  6 


Method 

CGU  |  SDU 

CGU  |  SDU 

CGU  |  SDU 

A 

10 

20 

30 

D{* 0) 

0,701 

0.448 

1.343 

0.772 

1.468 

0.983 

9*s 

0.291 

0.164 

1.185 

0.295 

0.659 

0.429 

00 

0.864 

0.654 

1.558 

1.155 

1.713 

1.466 

0.527 

0.558 

1.281 

0.802 

0.921 

0.930 

I(&) 

18.260 

13.15  1 

56. 359 

48.955 

92.472 

76.  OKI 

13.532 

12.435 

35.353 

21.347 

28.816 

29.100 

4f 

0,752 

0.435 

L4O0 

0.897 

1.723 

1,067 

0.578 

0.304 

0,695 

0.63  ] 

0.729 

0.566 

/(#■} 

-3.424 

4.659 

15.247 

28.911 

26,321 

49.757 

af(4G\ 

6.425 

9.435 

7.304 

13.810 

9.298 

20.167 

rim 

69 

95 

46 

8S 

41 

S3 

nf 

157.30 

91.67 

153.66 

91.48 

169.52 

92.38 

jP  -  ■  G 

by  starting  the  local  minimization  at  <j>0  ;  in)  4>  '  denotes  the  best 
local  minimum  of  /  ( ■ )  found  throughout  the  evolution  of  the  algo¬ 
rithm;  iv)  D(<j>)  denotes  the  (Euclidean)  distance  of  a  confonna- 
tion  0  from  the  known  optimal  conformation  which  is  at  the  origin, 
i  e.,  D{tf>)  =  (5Ii=i  v)  rj  denotes  a  metric  of  success  of 

the  corresponding  algorithm  adjusted  to  the  difficulty  of  the  op¬ 
timization  problem,  more  specifically,  17  =  1{D(0  T)  <  1}  for 
the  case  .4  =  10,  r\  —  1{D(^G)  <  1.2}  for  the  case  A  =  20, 
and  tf  —  l{D(<f)G )  <  L5}  for  the  case  A  —  30;  and,  finally, 
vi)  7i /  denotes  the  number  of  times  the  corresponding  algorithm 
evaluates  the  function  /( ■)  during  its  course. 

Table  I  reports  results  for  the  function  in  (23)  and  Table  II 
(left)  reports  results  for  the  function  in  (23)  when  we  rotate  the 
coordinate  system  by  60°  with  respect  to  all  six  axes.  In  each  of 
these  tables  we  considered  three  problem  instances  for  A  equal 
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TABLE  II 

Left:  CGU  and  SDU  Applied  to  the  Test  Function  in  (23)  When  n  =  6  and  the  Coordinate  System  is  Rotated  by  60~  With  Respect  to  all  Six 

Axes.  Right:  SDU  Applied  to  the  1 BRS  (Top)  and  1MLC  (Bottom) 


Method 

ASA 

CGU 

SDU 

~ ASA  |  CGU  ■  SDU 

ASA  f  CGU  SIX! 

A 

10 

20 

30 

3.506 

0.456 

11.528 

0,731 

11,727 

1,035 

fjL  . 

2.54$ 

0.184 

25.15 

0.302 

20.629 

0,400 

3.306 

0.659 

4.722 

0.829 

8.614 

1.259 

4>n 

1.736 

0.639 

2.991 

0.440 

13.424 

0661 

f(4>0  ) 

50.769 

25.649 

109.156 

66,150 

415,843 

95.0605 

27.153 

12,260 

82.185 

23,013 

1669  64 

29.169 

*c; 

W)  1 

2.66 

1,837 

0,490 

3,53 

2.539 

0.799 

4,16 

3,013 

1.128 

0.92 

0.672 

0.322 

1,04 

0.952 

0,509 

U3 

LI  17 

0.698 

mn 

-0.87 

8.108 

17.006 

9.01 

26.319 

49.985 

19.53 

42.614 

75.172 

7.48 

5.19 

10,46 

10,5 

9,45 

t7.0] 

12,28 

12.  is 

22.28  r 

am. 

0 

7 

93 

0 

3  86 

0 

4 

S4 

n, 

&6IH6  ' 

'  230.36  ' 

91.89 

81500 

229.84  '  85.95 

82273  ' 

246.74 

fw.77 

i 

<p? 

f(4>u) 

P(4>u) 

1 

-0.195 

-2.716 

0.702 

-65.58 

2.82 

2 

-0.403 

0.282 

-0.197 

-102.25 

0.53 

i 

0? 

4> 2 

0? 

mu) 

D(4>u) 

1 

-0.831 

*0.935 

1.428 

-24.30 

1.90 

2 

■0.83  L 

-0.935 

1.428 

-24.30 

1.90 

3 

0,056 

-0,444 

-0,550 

-64,57 

0,70 

4 

0,056 

-0.444 

-0.550 

-64.57 

0.70 

5 

-0,144 

0.764 

0.059 

-69.29 

0.78 

6 

•0.144 

0.764 

0.059 

-69.29 

0.78 

7 

-0.033 

0.549 

0.287 

■71.19 

0.62 

to  10, 20,  and  30,  respectively.  For  each  case,  we  performed  100 
independent  runs  of  both  CGU  and  SDU.4  The  various  values 
we  report  are  the  means  of  the  corresponding  quantities  over  the 
100  independent  runs.  We  also  report  the  corresponding  stan¬ 
dard  deviations;  ax  denotes  the  standard  deviation  of  X. 

We  should  note  that  the  most  important  performance  metric 
is  the  closeness  of  the  conformation  found  to  the  best  one  (mea- 
sured  by  D{<$> 7)  and  77).  There  are  a  number  of  reasons  for  this 
preference,  including  that  i)  the  3-D  structure  of  the  bound  com¬ 
plex  determines  its  function,  ii)  molecular  simulation  may  be 
used  to  discover  the  native  conformation  starting  from  a  confor¬ 
mation  dose  to  it,  and  iii)  the  energy  evaluation  models  used  by 
molecular  docking  algorithms  are  approximations  of  the  actual 
free  energy  function.  As  a  result,  a  conformation  closer  to  the 
native  one  (like  the  ones  SDU  discovers)  is  more  useful  than 
some  deep  local  minima  further  away  (like  the  ones  CGU  may 
discover). 

For  the  cases  where  the  test  function  is  not  rotated,  CGU, 
not  surprisingly,  performs  reasonably  well.  SDU  produces  con¬ 
formations  with  somewhat  higher  energy  but  closer  to  the  na¬ 
tive  one.  The  performance  difference  between  SDU  and  CGU 
is  more  dramatic  in  the  cases  where  the  test  function  has  been 
rotated.  In  the  three  cases  considered  CGU  fails  to  discover  a 
conformation  close  enough  to  the  native  one  while  SDU  largely 
succeeds  (observe  the  differences  in  /;).  The  key  reason,  we  be¬ 
lieve,  is  that  SDU  can  better  capture  the  structure  of  the  native 
funnel,  which  is  used  to  bias  sampling  in  that  area.  This  latter 
point  is  perhaps  more  clearly  illustrated  in  Fig.  5.  We  draw  his¬ 
tograms  of  the  distance  of  the  predictive  conformation  during 
the  evolution  of  both  algorithms.  Histograms  k  =  1 . (i  ap¬ 

pearing  from  top  to  bottom  correspond  to  (j> ^  obtained  at  the  Arth 
iteration.  The  horizontal  axis  plots  the  values  of  D{4>[  )  ( D{$G ) 
for  the  bottom  graph)  and  the  vertical  axis  plots  the  percentage 
of  problem  instances  (out  of  the  100  runs).  It  can  be  seen  that 
SDU  starts  with  a  much  better  initial  predictive  conformation 
than  CGU  and  maintains  its  predictive  conformation  within  2 
units  of  the  native  one  at  all  subsequent  iterations.  The  best  local 
minimum  found  by  SDU  is  not  always  within  1  unit,  but  in  84% 
of  the  cases  it  was  within  1 .5  unit  (versus  4%  with  CGU). 

We  also  compared  SDU  and  CGU  to  a  commonly  used 
simulated  annealing  algorithm,  the  ASA  algorithm  in  [22]  (see 

4Note  that  for  each  ran  a  number  of  random  samples  in  B  are  been  generated. 


Table  II).  Even  though  ASA  reaches  some  deep  local  minima  it 
ends  up  converging  far  away  from  the  global  minimum. 

In  terms  of  computational  efficiency  and  running  time,  CGU 
and  ASA  take  on  the  order  of  a  few  seconds  in  a  top-of-the-line 
PC  and  SDU  takes  on  the  order  of  5  minutes  without  partic¬ 
ular  effort  at  optimizing  the  code.  The  key  reason  is  that  SDU 
solves  a  number  of  (relatively  large)  SDP  problems.  However, 
SDU  performs  much  fewer  function  evaluations  (by  a  factor  of 
3  compared  to  CGU  and  by  a  factor  of  1000  compared  to  ASA 
as  it  is  evident  by  the  nj  values  in  Table  II).  For  the  test  func¬ 
tions  we  considered  in  this  section  the  function  evaluation  time 
is  negligible  using  (23).  Yet,  in  docking  actual  proteins  function 
evaluations  (e.g.,  using  CHARMM  [23]  or  other  complex  po¬ 
tentials)  dominate  all  other  tasks;  for  instance,  in  the  results  we 
report  in  the  next  section  function  evaluations  take  more  than 
90%  of  the  running  time. 

VII.  Applications  in  Rigid-Body  Protein  Docking 

In  this  section,  we  report  results  from  applying  SDU  to  a 
number  of  protein-to-protein  docking  instances.  We  assume  that 
both  proteins  are  rigid  bodies.  This  is  certainly  true  for  their 
backbones  and,  as  we  will  see,  we  will  allow  some  flexibility 
in  determining  the  conformation  of  the  side  chains  in  the  inter¬ 
face  between  the  two  proteins.  We  consider  3-D  problems  where 
one  protein  is  held  fixed,  both  proteins  are  oriented  based  on  in¬ 
formation  we  obtain  from  the  known  bound  structure,  and  we 
seek  to  determine  the  position  of  the  second  protein  in  the  bound 
structure  with  respect  to  the  first  one.  We  constrain  these  three 
translational  variables  to  be  in  a  10A  cube  centered  around  the 
origin  of  the  coordinate  system,  which  corresponds  to  the  na¬ 
tive  position  of  the  ligand.  That  is,  we  have  already  identified 
the  dominant  funnel  of  the  energy  function  and  seek  to  solve  the 
medium-range  energy  minimization  problem  within  this  funnel. 

The  (free)  energy  functional  we  minimize  includes  van  der 
Waals  interactions  (Ai?V(iw),  the  desolvation  energy  (A E^) 
and  the  electrostatic  energy  (A£eiet,),  We  use  the  software 
package  CHARMM  [23]  to  evaluate  the  free  energy  and  to 
perform  local  minimization.  We  allow  side-chains  to  be  flex¬ 
ible  during  the  local  minimization  phase;  this  is  critical  as 
side-chains  in  the  interface  between  the  two  proteins  can  not  be 
considered  rigid  and  they  are  packed  in  a  way  that  minimizes 
the  overall  free  energy  of  the  bound  complex. 
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Fig,  5.  Histograms  of  predictions  and  final  solutions  reported  by  CGU  and  SDli  for  the  case  corresponding  to  A  —  30  in  Table  II  (left). 


TABLE  III 

SDU  Applied  to  2 PTC 


i 

-■i' 

<P2 

mu) 

1 

1.001 

-2.555 

1 .395 

-25.98 

3.08 

2 

1.307 

-2.369 

1 .363 

-28.77 

3,03 

3 
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1.310 
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4 
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2.82 
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2.56 

6 

am 
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-46.37 

1.58 

7 

-a  182 

-0.587 

0.529 

-49.98 

0.81 

8 

0.170 

0.669 

*0,387 

-53,93 

0.82 

In  Table  II  (right)  and  Table  III  we  report  results  for  the  com^ 
plexes  1BRS  (barnase/barstar),  1MLC  (a  monoclonal  antibody 
and  lyzozyme  complex),  and  2PTC  (a  trypsin- inhibitor  com¬ 
plex).  The  bound  structure  for  each  case  has  the  ligand  centered 
at  the  origin,  that  is,  the  optimal  solution  is  (0,  0,  0).  The  ini¬ 
tial  search  region  B  is  a  10A  cube.  In  each  iteration  (indexed  by 
i  in  the  tables),  we  report  the  best  structure  found  so  far, 
the  corresponding  energy  f($0  )  (in  kcal/mol),  and  the  RMSD 
distance  D(4*C)  from  the  native  structure  (inA).  As  discussed 
earlier,  the  key  measure  of  success  is  the  proximity  of  D(t}>  J  ) 
to  zero. 

The  evolution  of  SDU  in  all  three  complexes  considered  is 
depicted  in  Fig.  6.  A  better  illustration  is  given  in  Fig.  7  for  the 
IBRS  complex.  In  both  figures  we  plot  the  fixed  receptor.  We 
generated  28  random  positions  of  the  ligand  in  the  initial  search 
region  and  in  the  left  figure  we  plot  a  sphere  at  the  center  of 
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iteration  iteration 


Fig.  6.  Evolution  of  SDU  for  the  IBRS,  1MLC,  and  2PTC  complexes. 

the  ligand  for  each  one  if  these  28  ligand  positions.  Spheres  in 
lighter  shades  of  gray  correspond  to  lower  energy  conforma¬ 
tions.  The  position  of  the  native  ligand  is  shown  as  a  dotted 
sphere  and  pointed  by  an  arrow.  After  performing  the  first  it¬ 
eration  of  SDU  we  sample  a  new  set  of  ligand  positions  using 
the  procedure  in  SDU’s  Step  5  (b)  (cf.  Fig.  3).  The  centers  of 
these  ligands  are  depicted  in  the  right  figure.  The  corresponding 
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Fig.  7.  Illustration  of  the  1BRS  docking  procedure. 


structures  are  both  very  close  to  the  native  one  and  have  low 
energies. 

We  conclude  by  noting  that  the  starting  point  for  our  medium- 
range  optimization  is  the  cluster  of  best  conformations  produced 
by  the  techniques  in  [10].  As  seen  in  Fig.  6,  these  produce  con¬ 
formations  within  3-5  A  from  the  native  one.  Our  method  is  able 
to  reduce  this  distance  to  less  than  1A  in  all  three  complexes 
considered.  In  most  cases  this  is  considered  a  very  successful 
identification  of  the  native  conformation  as  the  (approximate) 
energy  models  used  do  not  provide  a  much  greater  degree  of 
accuracy. 


VIII.  Conclusion 

We  introduced  a  new  method  for  minimizing  free-energy 
functions  appearing  in  molecular  docking  and  other  important 
problems  in  computational  structural  biology.  These  energy 
functions  are  notoriously  hard  to  minimize  as  they  consist  of 
terms  acting  in  disparate  space- scales  and  have  a  huge  number 
of  local  minima.  Yet,  in  certain  areas  they  exhibit  a  funnel-like 
shape  which  we  use  to  our  advantage. 

Our  method  works  on  the  surface  spanned  by  local  minima. 
We  developed  a  technique  based  on  semidefinite  programming 
to  form  a  general  convex  underestimator  of  the  energy  function. 
The  underestimator  guides  random,  yet  biased,  exploration  of 
the  energy  landscape.  We  established,  theoretically  as  well  as 
numerically,  that  our  method  performs  better  than  an  alternative 
(CGU)  method.  Finally,  we  applied  our  algorithm  to  some 
protein-protein  docking  problems  and  showed  that  the  resulting 
conformation  is  extremely  close  to  the  native  one  (within  1A 
RMSD).  This  improves  upon  the  3-5 A  RMSD  that  current 
state-of-the-art  methods  produce. 
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